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Abstract 

Background: DNA methylation plays an essential role in regulating gene expression under a variety of conditions 
and it has therefore been hypothesized to underlie the transitions between life cycle stages in parasitic nematodes. 
So far, however, 5'-cytosine methylation has not been detected during any developmental stage of the nematode 
Coenorhobditis elegons. Given the new availability of high-resolution methylation detection methods, an 
investigation of life cycle methylation in a parasitic nematode can now be carried out. 

Results: Here, using MethylC-seq, we present the first study to confirm the existence of DNA methylation in the 
parasitic nematode Trichinella spiralis, and we characterize the methylomes of the three life-cycle stages of this 
food-borne infectious human pathogen. We observe a drastic increase in DNA methylation during the transition 
from the new born to mature stage, and we further identify parasitism-related genes that show changes in DNA 
methylation status between life cycle stages. 

Conclusions: Our data contribute to the understanding of the developmental changes that occur in an important 
human parasite, and raises the possibility that targeting DNA methylation processes may be a useful strategy in 
developing therapeutics to impede infection. In addition, our conclusion that DNA methylation is a mechanism for 
life cycle transition in T. spiralis prompts the question of whether this may also be the case in any other 
metazoans. Finally, our work constitutes the first report, to our knowledge, of DNA methylation in a nematode, 
prompting a re-evaluation of phyla in which this epigenetic mark was thought to be absent. 
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Background 

Developmental regulation of gene expression plays a crucial 
role in the transitions between significantly differentiated 
life-history stages, such as is the case in parasitic nema- 
todes; however, the underlying mechanisms of this gene 
regulation are poorly understood. Although DNA methyla- 
tion has been established in other organisms as an 
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important method for altering chromatin structure and reg- 
ulating the expression of genes, its contribution to nema- 
tode development has not been adequately assessed given 
that so far no 5' cytosine methylation has been identified in 
any stage of Caenorhabditis elegans [1]. Most vertebrate 
cell types have approximately 60 to 90% of the CpG dinu- 
cleotides modified to 5-methylcytosine (5mC) [2], whereas 
invertebrate genomes vary extensively in the extent of 
DNA methylation, and some genomes have undetectable 
levels of methylation [3]. Recently, technological progress 
has enabled high- resolution detection of 5mC, opening the 
way for more detailed examination of the role of DNA 
methylation in a greater variety of eukaryotic genomes [4]. 

Parasitic nematodes are a good example of the biological 
importance of developmental regulation of genes, includ- 
ing the principal agent of human trichinellosis, Trichinella 
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spiralis. This food-borne agent infects a wide variety of 
vertebrate hosts through their ingestion of meat contain- 
ing encysted muscle larvae (ML). ML are released by the 
host's gastric juices, after which they grow substantially 
and mature into sexually active adults (Ad) in the host's 
intestines. New-born larvae (NBL) are released from 
mature females and then disseminate through the blood- 
stream, invade skeletal muscles, and encyst in a collagen 
capsule to form a new generation of ML [5]. The host 
muscle cells proliferate as they are transformed into 'nurse 
cells' for the parasite [6]. The major clinical symptoms of 
trichinellosis (myopathy) derive from inflammation direc- 
ted against the encysted ML. Thus, successful nematode 
development entails a series of physically and functionally 
distinct stages that require accurate recognition of specific 
biological cues. In this way, the life cycle of parasitic 
nemotodes is distinct from that of free-living nematodes, 
such as Caenorhabditis elegans, that live in a more homo- 
geneous environment. 

Stage-specific expression has been observed for genes in 
Trichinella spp. [7] . Differential expression was especially 
obvious for genes encoding the excretory-secretory (E-S) 
proteins released from the larvae. For example, a gene 
encoding a 43-kDa glycoprotein is expressed in precapsular 
and postcapsular muscle larvae, but not in adults [8]. E-S 
proteins may therefore contribute to capsule formation [9]. 
Stage-specific gene expression may also assist parasitic eva- 
sion or forestalling of immune reactions that would inhibit 
continued transmission. Thus, how stage- specific transcrip- 
tional regulation is accomplished in these organisms might 
prove useful for understanding and preventing infection. 

Recent innovations in high-throughput sequencing have 
enabled researchers to infer methylation patterns at sin- 
gle-base resolution [10]. MethylC-seq enables methylation 
analyses with unprecedented precision, and the recently 
released draft genome sequence of T. spiralis [11] pro- 
vided us the means to evaluate the methylome of its three 
distinct stages. Our work here describes the first compre- 
hensive study that confirms the existence of DNA methy- 
lation in T. spiralis and characterizes the differential 
methylomes of the organism during these life stages. We 
further identified sets of genes whose DNA methylation 
status varied between the developmental stages. Our data 
shed light on the developmental biology of an important 
food-borne zoonosis, and our approach opens the way for 
future assessment of methylation as a mechanism of devel- 
opmental regulation in this and other metazoans that 
undergo similar life cycle transitions. 

Results 

The presence of DNA methylation in the 7. spiralis 
genome 

To understand whether T. spiralis possesses the ability to 
methylate DNA, we conducted reciprocal Blast searches 



to identify genes that might be related to known DNA 
(cytosine-5)-methyltransferases. Our data revealed the 
existence of several relevant orthologous genes annotated 
in the draft T. spiralis genome [11] (Table SI in Addi- 
tional data file 1). We found that EFV54759.1 and 
EFV58204.1 were homologous to dnmtS de novo methyl- 
transferases and to the maintenance methyltransferase 
dnmtl, respectively, in species that are known to have 
DNA methylation, such as human and mouse. Of addi- 
tional interest, T. spiralis appeared to be the only nema- 
tode, compared to 11 other nemotodes, that possessed de 
novo methylation machinery (dnmtS). The other nema- 
todes only contained orthologs to maintenance methyl- 
transferase dnmtl, including Caenorhabditis elegans. We 
also identified an ortholog to dnmt2 (EFV60295.1), but it 
was more similar to a previously identified tRNA methy- 
lase [12-14], which suggests the potential existence of 
RNA methylation in T, spiralis. We used the sequences 
of these dnmt-like proteins to reconstruct a phylogenetic 
tree (Figure 1). This analysis indicated that T. spiralis 
dnmtS was not a close relative of orthologs in its host 
mammals, suggesting that T, spiralis dnmtS did not origi- 
nate from its host through horizontal gene transfer. 
We performed PGR with reverse transcription (RT-PCR) 
and found that T. spiralis dnmt2 and dnmtS genes were 
differentially expressed among three life stages, but that 
dnmtl expression remained at about the same level (Fig- 
ure Sla in Additional data file 2). Correspondingly, enzy- 
matic data using nuclear protein extracts also showed 
differential catalytic activity of the T. spiralis dnmts 
(Figure Sib in Additional data file 2). We also carried out 
ultra-performance liquid chromatography-tandem mass 
spectrometry (UPLC-MS/MS), which further confirmed 
the existence of DNA methylation in T. spiralis, showing 
that the total amount of DNA methylation in the Ad 
stage was significantly higher than in the NBL stage (Fig- 
ure S2 in Additional data file 2) [15]. 

Given these results, we assessed the genome-wide 
DNA methylation profiles in the three life stages of 
r. spiralis (Ad, ML, and NBL) using MethylG-Seq. We 
generated 61.65, 23.52 and 55.77 million raw reads, 
respectively. We aligned the reads to the T, spiralis 
reference sequence [16] and mapped approximately 
96.36% of the reads to Ad, 91.30% to ML and 99.27% to 
NBL, yielding 2.91, 1.05 and 2.71 Gb of DNA sequence 
to Ad, ML, and NBL, respectively. The average read 
depth was 21.36, 10.80 and 26.21 per strand, respec- 
tively. On average, over 81.6% of each strand of the 64 
Mb T. spiralis reference sequence was covered by at 
least one sequence read in each of the three stages. 
Because of the potential for the occurrence of non- 
conversion and thymidine-cytosine sequencing errors, 
we estimated the false-positive rate as the percentage of 
cytosines sequenced at cytosine reference positions in 
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Figure 1 Phylogenetic tree of dnmt proteins. Multiple sequence alignment was performed by ClusterW, then ClusterW with the neighbor- 
joining method based on G (Jones-Taylor-Thornton and Gamma Distribution) model was applied to reconstruct the phylogenetic tree. The 
species with best hits to T. spiralis dnmts were used as representatives that span the phylum and were analyzed in this study. 



the Lambda genome, which are normally unmethylated 
(Materials and methods). We then applied the error 
rates for each stage (0.0060, 0.0064 and 0.0025 for Ad, 
ML, and NBL, respectively) to correct mC identification 
according to a method described by Lister et aL [10] 



that is based on a binomial test and false discovery rate 
constraints. Corrected estimates resulted in approxi- 
mately 0.31 million and 0.24 million mCs in the Ad and 
ML genomes (comprising 1.59% and 1.22% of their 
sequenced cytosines, respectively). In contrast, methylation 
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was nearly undetectable in NBL (0.002 million; 0.01%; 
Table S2 in Additional data file 2). We validated the 
results using two different methods: (1) bisulfite-PCR 
(BSP), cloning, and conventional sequencing by the Sanger 
method; and (2) methylated DNA immunoprecipitation 
(MeDIP) combined with quantitative PGR (QPCR). For 
BSP, we assessed six randomly selected genomic regions 
that varied in their estimated amount of methylation, and 
obtained strong agreement between the two experimental 
results (P-value < 0.05 using double ^-test; Figure S3 in 
Additional data file 2; Table S3 in Additional data file 1) 
[15]. For MeDIP with QPCR, we assessed three randomly 
selected genomic regions and confirmed the existence of 
DNA methylation in all three regions (Figure S4 in Addi- 
tional data file 2). 

Characterization of overall methylation patterns in the 
three life stages 

We further characterized the global patterns of DNA 
methylation in the genomes of the different T. spiralis 
stages. The highest amount of detected mCs (82.25% 
and 89.06%) in Ad and ML were located in CG regions, 
indicating a dominant role of CpG methylation in these 
stages. Due to the very low level of DNA methylation in 
NBL, the distribution of CG and non-CG methylation 
was very similar to the background (Figure 2a). The 
average methylation level of specific cytosine residues 
could be estimated from the fraction of methylated 
sequence reads at that site. Here we found that the aver- 
age methylation level of specific cytosine residues was 
estimated from the fraction of methylated sequence 
reads at that site (Figure 2c). 

Since the mCs in the T. spiralis genome are relatively 
sparse compared to vertebrate genomes, we identified 
methylation regions (MRs) of the genome using relatively 
dense mCs (Materials and methods). Different CG and 
non-CG methylation might be subject to distinct forms 
of genetic control; therefore, MR identification was per- 
formed independently for CG and non-CG contexts. 
Across the genome, we observed an increase in CG 
methylation as the parasites matured from the NBL to 
the ML stage and, to a lesser extent, in the transition 
from ML to Ad. In addition, CG methylation levels fluc- 
tuated drastically across the genome, indicating a mosaic 
methylation pattern where relatively dense methylated 
domains are interspersed with regions that are not 
methylated (Figure 2b). Such a pattern has been observed 
in previous studies on other invertebrates [3] . In contrast, 
we identified only a small number of non-CG MRs 
(Table S4 in Additional data file 1). 

In all types of genomic elements, we saw a methyla- 
tion increase from NBL to ML and from ML to Ad as 
well as the global pattern. We then examined patterns 
of methylation in distinct genomic elements, including 



genes, tandem repeats, and transposable elements. 
Genes were methylated more frequently than the gen- 
ome average (Figure 3a, b). Within genes, the coding 
sequences were more methylated than the flanking 
DNA or promoter regions, while introns were the least 
methylated (Figure 3c, d). Notably, repeat elements, 
including tandem repeats and transposable elements, 
exhibited much higher DNA methylation than the genome 
average (Figure 3a, b). Previous studies have indicated that 
the methylation level of transposons across different phy- 
logenetic units may vary. It has been reported that trans- 
posable elements are highly methylated in mammals, 
plants and zebrafish {Danio rerio), and moderately methy- 
lated in Ciona intestinalis, but are usually unmethylated in 
the honey bee Apis mellifera and silkworm Bombyx mori 
[17,18]. In T, spiralis, we observed higher methylation on 
transposons relative to the immediate flanking regions as 
well (Figure S5 in Additional file 2), which is similar to 
what is seen in Ciona intestinalis [17]. 

The relationship between stage-dependant methylation 
and gene expression 

We evaluated differential gene expression among the three 
life stages using Illumina high-throughput RNA-seq tech- 
nology. Most of the raw reads (numbering 28,662,704, 
26,128,346 and 28,962,820, respectively, for the Ad, ML 
and NBL stages) could be uniquely mapped to previously 
annotated genes (62.26%, 64.38%, and 64.34%). We 
detected 12,675, 12,683 and 12,909 annotated genes out of 
the total 16,379 with at least one unique read. The major- 
ity of these genes (11,636) were expressed in all three life 
stages, and we saw 234 Ad-stage specific, 183 ML-specific 
and 445 NBL-specific genes. Of note, we also detected 
stage-dependent expression of methyltransferases that 
were concordant with prior RT-PCR results (Figure S2 in 
Additional data file 2). Finally, among genes that were 
expressed in more than one stage, we identified differential 
expression in 1,752 pair-wise comparisons (Table S5 in 
Additional data file 1). 

We characterized the changes in DNA methylation 
among the three distinct life stage methylomes and the 
relationship between methylation and differential gene 
expression. For this, we divided expressed genes with at 
least one sequencing read into quartiles of expression 
levels, and examined the expressed genes together with 
another category composed of genes exhibiting no 
expression. We found that DNA methylation levels of 
gene upstream regions had a negative correlation with 
gene expression levels, and non-expressed genes in parti- 
cular had different patterns of DNA methylation as the 
methylation levels in their upstream regulatory regions 
were higher than in the coding sequences (Figure 4a, b). 
Based on this, it is likely that methylated promoters 
induce silencing in T, spiralis, akin to the widely accepted 
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Figure 2 DNA methylation patterns and chromosomal distribution of three life stages of T. spiralis, (a) The fraction of mCs identified in 
eacli sequence context in tine tliree life stages in comparison witli tine fraction of all Cs in each sequence context in the genome, (b) 
Distribution of MRs identified on the two DNA strands (Watson and Crick) throughout the whole genome. The value refers to the average 
percentage of methylation of the MRs, as shown on the y-axis. (c) Distribution of mCs (y-axis) across the percentage methylation levels (x-axis). 



role for hypermethylation of promoters as a means of 
repressing gene expression in plants and mammals 
[19,20]. In contrast, the gene-body methylation levels in 
our analysis showed a bell-shaped, rather than mono- 
tonic, relationship with gene expression levels. Generally 
in the gene body, the non-expressed and most highly 
expressed genes had the lowest DNA methylation levels, 
whereas the mid-level expressed genes had the highest 



percentage of DNA methylation (Figure 4a-c). A bell- 
shaped relationship between gene-body methylation and 
expression levels has been observed previously in plants 
{Arabidopsis thaliana and Oryza sativa), invertebrates 
{Ciona intestinalis and Nematostella vectensis), and 
humans as well [4,21,22], indicating conservation of the 
role of methylation across phylogenetically diverse 
species. 
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Figure 3 Average methylation levels of different genomic regions of the three life stages of T. spiralis, {a, b) Average density of 
methylation levels (a) and percentage of methylation levels (b) (y-axis; Materials and methods) at different functional regions (x-axis). (c, d) 
Average density of methylation levels (c) or percentage of methylation levels (d) (y-axis) of intervals around genie regions (x-axis). Two-kilobase 
regions upstream and downstream of each gene were divided into 100-bp (bp) intervals. Each coding sequence or intron was divided into 20 
intervals (5% per interval). 



Biological implications of stage-dependant methylation in 
T. spiralis 

We examined the correspondence of methylation status 
to divergent gene expression in different stages. Due to 
the overall low level of genome methylation in general, 
we limited this analysis to MRs exhibiting high levels of 
methylation where we had at least 5x depth of coverage. 
Using this criteria, we found a total of 652 ML and 767 
Ad MRs enriched for methylation in CG regions, but 
MRs in non-CG regions were rare. In contrast, we 
found no MRs in the NBL stage. As shown in Figure 5a, 
389 MRs were shared between Ad and ML. These MRs 
were located in 486 and 551 genes in the ML and Ad 
stages, respectively, with the majority located in gene- 
body regions (Table S4 in Additional data file 1). 



We carried out a Gene Ontology analysis to function- 
ally characterize those genes with CG MRs in Ad and 
ML using GOstat [23]. Enrichment of GO terms defined 
by a significant false discovery rate-corrected P-value 
(<0.01) in the 'molecular function' category was indicated 
in 'DNA integration', 'DNA metabolic process' and so on. 
In the 'biological process' category, 'nucleic acid meta- 
bolic process' and endopeptidase activity' and so on were 
enriched. Of note, we found that many genes were shared 
among different molecular pathways and constituted a 
central focus of study in parasitic nematodes (Table S6 in 
Additional data file 1). Given this, we explored the poten- 
tial for DNA methylation regulating genes that are 
related to parasitic activities. For example, the protein 
EFV53263.1 is encoded by a DNase II gene in the 'DNA 
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metabolic process' category that is expressed in a stage- 
specific manner and important for T. spiralis parasitism 
[24], and we found that this gene was uniquely tran- 
scribed in the NBL stage, whereas it was not expressed 
and had hypermethylated promoter regions in the Ad 
and ML stages. Considering the globally monotonic and 
negative correlation between promoter methylation and 
gene expression, this finding strongly suggested that 
promoter methylation plays a role in the regulation of 
stage-specific expression of certain DNase II genes in 
T. spiralis (Figure 6a). 

In addition to a role for promoter methylation in expres- 
sion, recent studies have suggested that gene-body methy- 
lation is involved in alternative splicing regulation. Our 
RNA-seq results indicated that many T. spiralis genes 
were alternatively spliced (Figure S6 in Additional data file 
2), and in relation to this, we saw a steep change in the 
methylation frequency across splice junctions on both 
sense and antisense strands in the MR genes (Figure 6b). 
Moreover, there were considerable methylation differences 
between skipped and constitutive exons or between 
retained and spliced introns (Figure 6c). Particularly pro- 
minent was also the 'infiltration of methylation' in intronic 
sequences of neighboring splice junctions (Figure 6b, c). 
These results are in agreement with previous studies 



[25-27]. Taken together, our data suggest that DNA 
methylation status has a relationship with the donor/ 
acceptor sequence context around the splice junctions, 
indicating the potential influence of DNA methylation on 
alternative splicing of MR genes. 

Discussion 

The status of DNA methylation in nematode genomes 
has been unclear. Research on Caenorhabditis elegans, 
which reportedly lacked mC in age-synchronous senes- 
cent populations, indicated it had negligible influence [1]. 
With regard to Caenorhabditis elegans, our computa- 
tional searches here indicate that it has dnmtl, but not 
dnmt2 [12] or dnmtS. As dnmtS is essential for de novo 
methylation [28], the lack of dnmtS in Caenorhabditis 
elegans might explain the absence of de novo methylation 
in this nemotode. In contrast, we identified three methyl- 
transferase genes in the T, spiralis genome that were 
orthologs to known dnmts in vertebrates, including 
dnmtl, dnmt2 and dnmtS. Intriguingly, of 11 species of 
nematodes tested, T. spiralis appeared to be the only 
nematode possessing de novo methylation machinery 
(dnmtS). Moreover, we found that expression levels of 
dnmt2 and dnmtS increased during development from 
larvae to adulthood, but that dnmtl did not. We note 
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Figure 6 Stage-dependant DNA methylation in relation to gene repression and alternative splicing (a) Graphical representation of 
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that our analysis indicated dnmt2 was more similar to a 
tRNA methylase instead of a DNA methyltransferase 
[12-14], which might indicate that RNA methylation also 
plays a role in T. spiralis development, and is therefore 
worth more detailed analysis in the future. 

We further carried out the first comprehensive, high- 
resolution analysis of methylation in T. spiralis, to assess 
the intriguing possibility, given the presence of de novo 
DNA methyltransferase orthologs, that epigenetic control 
might help govern the development of its distinct life his- 
tory stages via temporally regulated gene expression. 
Methylome sequencing revealed a mosaic methylation pat- 
tern in T. spiralis, typical of other invertebrates [29,30]. 
DNA methylation increased drastically during maturation 
from NBL to ML, and adults exhibited the highest 
observed DNA methylation level. This finding contrasts 
with a trend seen in some other species where methylation 
patterns remain stable throughout the life cycle [29]. For 
instance, in the sea urchin, which also has distinct life 
stages, methylated and non-methylated regions in its gen- 
ome retain the same overall methylation composition 
throughout all tested stages [31]. The relative overall con- 
stancy of methylation patterns is also a feature of verte- 
brate genomes. However, the extent of DNA methylation 
may reflect changes in both intrinsic and environmental 
exposure [32]. For instance, studies in humans indicated 
that total genomic 5-methylcytosine has been found to 
typically decrease during aging [33,34], in concordance 
with declining Dnmtl activity with age [35]. Parasites such 
as T. spiralis certainly undergo more drastic lifespan 
changes in response to environmental cues, that is, meta- 
morphosis critical to their survival and reproduction. Our 
findings here provide evidence to indicate that these DNA 
methylation changes might play an important role in regu- 
lating such transformations in T. spiralis. 

Previous studies have indicated that methylation may be 
an evolutionarily ancient means of transcriptional control 
as it is maintained in phylogenetically diverse lineages. In 
both plants and vertebrates, the notion that methylation in 
promoters primarily represses genes by impeding tran- 
scriptional initiation has been widely accepted [19,20], 
whereas intermediate levels of expression have been asso- 
ciated with genes experiencing the greatest extent of 
methylation in the gene body, indicating a bell-shaped 
relationship [4,21,22,36]. However, in the fungus Neuro- 
spora crassa [37] and the silkworm Bombyx mori [18], 
transcription initiation is unaffected. Thus, DNA methyla- 
tion shows remarkable diversity in its extent and function 
across eukaryotic evolution. Here, our results indicate that 
the presence of promoter methylation correlates with 
reduced gene expression levels. Promoter hypermethyla- 
tion may regulate a portion of stage-specific genes by 
repressing their transcription initiation in non-expressed 



stages, as exemplified by a NBL-specific DNase II gene 
(Figure 6a). Our assessment of gene-body methylation, 
which had a bell-shaped relationship between methylation 
and gene expression, indicates there was no overt relation- 
ship between expression and methylation levels. We did, 
however, see evidence for a relationship between methyla- 
tion within the gene-body and alternative splicing of these 
genes in T. spiralis, indicating these regulatory mechan- 
isms of gene and protein activity are an area of interest for 
future study as well; and the presence of a tRNA methy- 
lase ortholog makes further study of RNA regulation in 
general in the organism of interest. 

In relation to the notion that complex regulatory 
machinery of DNA methylation has been developed in 
T, spiralis with species-specific characteristics, probably 
in response to environmental cues, we found that many 
of the MR genes are enriched in pathways that are func- 
tionally important for parasitic nematodes. Such genes 
modulate the interaction between the parasite and its 
host so as to protect the parasite against host immune 
responses. There was also enrichment in pathways that 
are important to parasitic activity, including previously 
reported catalytically active E-S proteins. Of note, 
hydrolases are among the most abundant proteins 
secreted by parasites and facilitate host tissue invasion 
[38]. Also important to T, spiralis is the conversion of 
muscle cells to nurse cells, and DNA-binding proteins 
[39], which are often affected by methylation changes, 
are believed to interfere with host cell signaling in ways 
that promote this conversion. Additionally, many such 
proteins are encoded by large, developmentally regulated 
gene families and assume different isoforms, which is 
also relevant to our findings that MRs were primarily 
distributed in gene bodies rather than promoter regions 
and the DNA methylation status was related to the 
donor/acceptor sequence context around their splice 
junctions. 

Conclusions 

We describe the first comprehensive study confirming 
the existence of DNA methylation in three life stages of 
T. spiralis. Our data also provide support for DNA 
methylation being associated with the regulation of 
genes that are closely related to the parasitism of the 
organism. In this context, in T, spiralis and in other 
organisms that experience discrete and highly specia- 
lized development forms, further consideration should 
be given to mechanisms where DNA methylation is 
involved in suppression of spurious transcriptional 
initiation of infrequently transcribed genes, promotes 
transcriptional termination, or mediates alternative spli- 
cing, as has been shown for other model organism 
systems. 
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Materials and methods 

Collection of 7. spiralis muscle larvae, adults and new- 
born larvae 

Infective T. spiralis ML were obtained from infected 
mice at 35 days post-infection by digestion of minced 
skeletal muscle in 1% pepsin and 1% HCl for 45 minutes 
at 42°C with agitation, as previously described [40]. 
Seventy male 6-week-old Wistar rats were then orally 
inoculated with a dose of 8,000 infective ML. Adult 
worms (Adl) were obtained from the intestine of ten 
rats at 30 h post-infection. The remaining 60 rats were 
sacrificed at 6 days post-infection, and the adult worms 
(Ad6) were recovered and incubated in Iscove's Modi- 
fied Dulbecco's Medium (IMDM) in 75-cm^ cell culture 
plates at 37°C. Newborn larvae were harvested every 
6 h. All experiments were performed in accordance with 
the Guide for the Care and Use of Laboratory Animals 
published by the National Institutes of Health (publica- 
tion no. 85-23, revised 1996). The protocol was 
approved by the Ethical Committee of the Institute of 
Zoonosis, Jilin University, China (reference number 
20080106). 

Enzymatic activity analysis of Dnmts 

To test the dnmt enzymatic activity of T. spiralis, 11 (ig 
of nuclear extracts for each assay were incubated in 37°C 
for 2 h using a EpiQuik™ DNMT Activity/Inhibition 
Assay Ultra Kit (Epigentek, Farmingdale, NY, USA) 
according to the manufacturer's instructions. 

BlastP searches and phylogenetic analysis of Dnmts 

Reciprocal BlastP comparisons were first performed to 
identify dnmt orthologs. Significant hits were defined as 
those satisfying the following criteria: E-value < 10'^ and 
the aligned segments covering at least 30% of the 
sequence length of the hit. For phylogenetic analysis, 
multiple sequence alignment was performed by Clus- 
terW [41]. The ClusterW with the neighbor-joining 
method [42] based on JTT+ G (Jones-Taylor-Thornton 
and Gamma Distribution) model was applied to recon- 
struct the phylogenetic tree. 

UPLC-MS/MS analysis of global DNA methylation 

UPLC-MS/MS analysis was performed according to a 
previously published method [43]. Genomic DNA 
(0.2 (ig) extracted from Ad and NBL was digested with 
lU DNase I, 2U Alkaline Phoaphatase, Calf Intestinal 
(CIP) and 0.005U snake venom phosphodiesterase I at 
37°C for 24 h. A microcon centrifugal filter device with 
a 3,000 D cutoff membrane was used to remove protein 
from the digested DNA samples by centrifuging at 
12,000 rpm for 60 minutes. The mobile phase, consist- 
ing of 5.0% methanol and 95% water (plus 0.1% formic 
acid), was used for UPLC separation of the nucleotides 



at a flow rate of 0.25 ml/minute. Enzymatically digested 
DNA samples (10 [i\ each) were injected for UPLC-MS/ 
MS analysis and each run took 10 minutes. Mass spec- 
trometry conditions were as follows: ionizationmode, 
ESI-positive; capillary voltage, 3,500 V; nitrogen drying 
gas temperature, 300°C; drying gas flow, 9 L/min; nebu- 
lizer, 40 psi. For MS/MS analysis of nucleotides, the 
fragmentor voltage was 90 V, collision energy was per- 
formed at 5 eV and scan time was 100 ms. Multiple- 
reaction monitoring (MRM) mode was used for the 
UPLC-MS/MS analysis by monitoring transition pairs of 
m/z 242.1/126.0 corresponding to 5mdC. The isotope 
labeled internal standard (5mdC-d3) was used to quan- 
tify genomic DNA methylation level, whose m/z was 
245.4/129.0. 

MethylC-seq library construction and sequencing 

Prior to library construction, 5 [ig of genomic DNA spiked 
with 25 ng unmethylated Lambda DNA (Promega, Madi- 
son, WI, USA) was fragmented using a Covarias sonica- 
tion system to a mean size of approximately 200 bp. After 
fragmentation, libraries were constructed according to the 
lUumina Pair-End protocol with some modifications. 
Briefly, purified randomly fragmented DNA was treated 
with a mix of T4 DNA polymerase, Klenow fragment and 
T4 polynucleotide kinase to repair, blunt and phosphory- 
late the ends. The blunt DNA fragments were subse- 
quently 3' adenylated using Klenow fragment (3'-5' exo-), 
followed by ligation to adaptors synthesized with 5'- 
methylcytosine instead of cytosine using T4 DNA ligase. 
After each step, DNA was purified using the QIAquick 
PCR purification kit (Qiagen, Shanghai, China). Next, 
a ZYMO EZ DNA Methylation-Gold Kit™ (ZYMO 
Research, Irvine, CA, USA) was employed to convert 
unmethylated cytosine into uracil, according to the manu- 
facturer s instructions, and 220 to 250 bp converted pro- 
ducts were size selected. Finally, PCR was carried out in a 
final reaction volume of 50 (il, consisting of 20 [i\ of size- 
selected fractions, 4 \i\ of 2.5 mM dNTP, 5 [d of lOx buf- 
fer, 0,5 \i\ of JumpStartTM Taq DNA Polymerase, 2 \i\ of 
PCR primers and 18.5 \i\ water. The thermal cycling pro- 
gram was 94°C for 1 minute, 10 cycles of 94°C for 10 s, 
62°C for 30 s, 72°C for 30 s, and then a 5-minute incuba- 
tion at 72°C, before holding the products at 12°C. The 
PCR products were purified using the QIAquick gel 
extraction kit (Qiagen). Before analysis with lUumina 
Hiseq2000, the purified products were analyzed by the 
Bioanalyser analysis system (Agilent, Santa Clara, CA, 
USA) and quantified by real time PCR. Raw sequencing 
data were processed using the lUumina base-calling 
pipeline (lUumina Pipeline vl.3.1). The sodium bisulfite 
non-conversion rate was calculated as the percentage of 
cytosines sequenced at cytosine reference positions in the 
Lambda genome. 
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RNA-sequencing and real-time PGR validation 

Total RNA was extracted using the Invitrogen TRIzol® 
Reagent and then treated with RNase-free DNase I 
(Ambion, Guangzhou, China) for 30 minutes. The integ- 
rity of total RNA was checked using an Agilent 2100 
Bioanalyser. cDNA libraries were prepared according to 
the manufacturer s instructions (Illumina). The poly(A)- 
containing mRNA molecules were purified using Oligo 
(dT) Beads (Illumina) from 20 [ig of total RNA from each 
sample. Tris-HCl (10 mM) was used to elute the mRNA 
from the magnetic beads. To avoid priming bias when 
synthesizing cDNA, mRNA was fragmented before the 
cDNA synthesis. Fragmentation was performed using 
divalent cations at an elevated temperature. The cleaved 
mRNA fragments were converted into double-stranded 
cDNA using Superscript II, RNaseH and DNA Pol I, 
primed by random primers. The resulting cDNA was 
purified using the QIAquick PGR Purification Kit (Qia- 
gen). Then, cDNA was subjected to end repair and phos- 
phorylation using T4 DNA polymerase, Klenow DNA 
polymerase and T4 Polynucleotide Kinase (PNK). Subse- 
quent purifications were performed using the QIAquick 
PGR Purification Kit (Qiagen). These repaired cDNA 
fragments were 3'-adenylated using Klenow Exo- (Illu- 
mina) and purified using the MinElute PGR Purification 
Kit (Qiagen), producing cDNA fragments with a single 
'A' base overhang at the 3' end for subsequent ligation to 
the adapters. Illumina PE adapters were ligated to the 
ends of these 3'-adenylated cDNA fragments and then 
purified using the MinElute PGR Purification Kit (Qia- 
gen). To select a size range of templates for downstream 
enrichment, the products of the ligation reaction were 
purified on a 2% TAE- Certified Low-Range Ultra Agar- 
ose (Bio-Rad, Hercules, GA, USA). cDNA fragments 
(200 ± 20 bp) were excised from the gel and extracted 
using the QIAquick Gel Extraction Kit (Qiagen). Fifteen 
rounds of PGR amplification were performed to enrich 
the adapter-modified cDNA library using primers com- 
plementary to the ends of the adapters (PGR Primer PE 
1.0 and PGR Primer PE 2.0; Illumina). The 200 ± 20 bp 
PGR products were purified using QIAquick Gel Extrac- 
tion Kit (Qiagen), using the MinElute spin columns (Qia- 
gen). Finally, after detection on an Agilent Technologies 
2100 Bioanalyser using the Agilent DNA 1000 chip kit 
and quantification on a StepOne plus qPGR (ABI, Wood- 
lands, Singapore), the cDNA library products were 
sequenced using the Illumina Genome Analyser. Real- 
time PGR validation was conducted using the Maxima® 
SYBR® Green qPGR Master Mix kit (Fermentas, Beijing, 
Ghina), according to the manufacturer's instructions, in 
an ABI Prism 7500 Sequence Detection System machine 
(Applied Biosystems Inc., GA, USA). All real-time RT- 
PGR data were normalized to the NBL stage (see Addi- 
tional data file 1 for primer information). 



Sequence alignment of MethylC-seq 

The reads generated by Illumina sequencing were aligned 
onto the T. spiralis reference genome [11]. The Lambda 
genome was also included in the reference sequence as 
an extra chromosome so that reads originating from the 
unmethylated control DNA could be aligned. Because 
DNA methylation has strand specificity, the plus strand 
and the minus strand of the T. spiralis genome were 
separated to form alignment target sequences. To do 
this, each cytosine in the genome was converted to a thy- 
mine, termed the T-genome, which represented the plus 
strand. Meanwhile, each guanine in genome sequences 
was converted to adenosine, termed the A-genome, 
which represented the minus strand. Additionally, the 
original forms of the reads were also transformed to deal 
with the bisulfite treatment nucleotide conversion in the 
alignment process. First, the observed cytosines on the 
forward read of each read pair were replaced in silico by 
thymines, and secondly, the observed guanines on the 
reverse read of each read pair were replaced in silico by 
adenosines. We then mapped the 'alignment form' reads 
to the 'alignment form' target sequence using SOAPa- 
ligner with default parameters [16]. Every hit of a single 
placement with a minimum number of mismatches and a 
clear strand assignment was defined as an unambiguous 
alignment, and each alignment was used for mG 
ascertainment. 

Gene annotation and functional analysis 

For gene annotation, the BLAST algorithm was applied 
to further annotate the genes defined in the available 
T. spiralis genome annotation because the current anno- 
tation is incomplete. All the predicted protein sequences 
of T. spiralis genes were aligned using BLAST with 
known annotated protein sequences from three data- 
bases, including SWISS-Prot, TrEMBL and InterPro. A 
cutoff E-value < le-05 was applied for annotation, and a 
best alignment term for each query protein sequence 
was selected if more than query sequence was aligned 
based on this cutoff E-value from BLAST. 

For function analysis, GO analysis was performed 
based on the annotated genes by GOstat software [23]; 
8,286 genes with annotation out of 16,380 genes were 
used as background, and 287 and 242 genes with anno- 
tation out of the 540 and 454 genes containing MRs in 
Ad and ML, respectively, were used as input genes. 
Fisher s exact test was performed and the P-values gen- 
erated for each GO category were adjusted according to 
the Benjamini and Hochberg correction method. 

Identification of methylcytosines and methylation regions 
and determination of methylation level 

For mG identification, we transformed each aligned read 
and the two strands of the T. spiralis genome back to 
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their original forms to build an alignment. In the unique 
part of the genome, cytosines that were covered by cyto- 
sines from reads on the same strand or guanines from 
those on the opposite strand (hereafter, referred to as 
ascertainment bases) were called as potential methylated 
sites. To account for the inefficient conversion of the 
bisulfite treatment and for sequencing errors, a correc- 
tion method based on binomial tests and false discovery 
rate constraints [10] was applied to the data to build a 
high-quality methylome for each of the three stages. The 
probability p in the binomial distribution B(n, p) was esti- 
mated from the number of cytosine bases sequenced in 
reference cytosine positions in the unmethylated Lambda 
genome (referred to as the error rate: non-conversion 
plus sequencing error frequency). The bisulfite conver- 
sion rates for all samples were over 99%, and the error 
rates were as follows: Ad, 0.0060; ML, 0.0064; NBL, 
0.0025. Then, the mCs from the binomial distribution 
analysis were selected for determination of MRs, which 
were defined as being under a threshold of more than 
five continuous mCs (either CpG or non-CpGs in at least 
one strand) and having a distance between adjacent mCs 
of less than the median value (18 bp) of all distance 
values. Stage-specific MRs were defined as containing 
more than five continuous mCs and no overlap between 
two samples. 

Percentage methylation was computed as the fraction 
of reads number of 'C in the total reads number of 'C 
and 'T' for each covered CpG site, and herein average 
percentage methylation of all cytosine residues for any 
genomic region covered was computed as the fraction 
of reads number of 'C' in the total reads number of 'C' 
and 'T' for each genomic region. Density methylation 
(mC/C) was determined as the number of mCs divided 
by total number of C sites in any genomic region. 

Validation of DNA methylation 

Two strategies were applied to validate the methylation 
status of randomly selected genomic regions. BSP com- 
bined with cloning Sanger sequencing. BSP primers were 
designed by the online MethPrimer software (Additional 
data file 1). Genomic DNA (500 ng) was converted using 
the ZYMO EZ DNA Methylation- Gold Kit^^ according to 
the manufacturer's instructions. PGR amplification was 
carried out with a thermal cycling program of 94°C for 
1 minute, 30 cycles of 94°C for 10 s, 58°C for 30 s, 72°C 
for 30 s, and then 5 minutes at 72°C. The products were 
then held at 12°C. Following amplification, PGR products 
were gel selected and purified using the QIAquick gel 
extraction kit (Qiagen), and the purified PGR products 
were subcloned. The colonies from each region were 
sequenced on a 3730 genetic analyxer (Applied Biosys- 
tems) to determine the methylated cytosine levels. 



MeDIP followed by QPGR (see Additional data file 1 
for primer information) was performed on 300 to 400 
ng of original genomic DNA for each sample, which 
was randomly sheared to an average length of 200 to 
500 bp by sonication. A MeDIP assay was then per- 
formed using the Magnetic Methylated DNA Immuno- 
precipitation Kit (Diagenode, Liege, Belgium) according 
to the instructions. The immunoprecipitated products 
and 10% amount of original input DNA were purified 
with ZYMO DNA Glean & Goncentrator-5 kit (ZYMO) 
in parallel. The purified DNA was analyzed by QPGR 
on an ABI StepOne Plus Real Time PGR System 
(Applied Biosystems Inc.) using Eva Green (Biotium, 
Shanghai, Ghina). The relative methylation levels of 
particular genomic loci among samples were compared 
by measuring the amount of immunoprecipitated DNA 
after normalization to the 10% of input DNA: % 
(MeDNA-IP/Total input) = 2^[Gt(10%input) - 3.32 - 
Gt(MeDNA-IP)] x 100%. 

Data availability 

Sequence data and processed data are available under 
the Gene Expression Omnibus accession GSE39328. 
UPLG-MS/MS and BS-PGR data have been deposited in 
GigaDB, the GigaScience database, with the unique 
identifier doi:10.5524/100043 [15]. 

Additional material 



Additional file 1: Supplemental Tables SI to S6 and corresponding 
captions. 

Additional file 2: Information on all primer sequences, and 
Supplemental Figures SI to S6 and corresponding legends. 
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